Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation

Authors

  • Osman Baskaya
  • David Jurgens
Abstract

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words. While unsupervised techniques have been proposed to overcome this data sparsity problem, such techniques have not outperformed supervised methods. In this paper, we propose a new approach to building semi-supervised WSD systems that combines a small amount of sense-annotated data with information from Word Sense Induction, a fully unsupervised technique that automatically learns the different senses of a word based on how it is used. In three experiments, we show how sense induction models may be effectively combined to ultimately produce high-performance semi-supervised WSD systems that exceed the performance of state-of-the-art supervised WSD techniques trained on the same sense-annotated data. We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce accurate WSD systems with minimal annotation effort.
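
As a rough illustration of the general idea of pairing a few sense-annotated instances with induced senses, the sketch below maps each induced sense to the annotated sense it most often co-occurs with. This is not the authors' system: the function names, the cluster IDs, and the toy sense labels are all hypothetical, and majority-vote mapping is only one simple way to connect induced and annotated senses.

```python
from collections import Counter, defaultdict


def learn_sense_mapping(induced_labels, gold_labels):
    """Map each induced sense to the gold sense it most often co-occurs with."""
    counts = defaultdict(Counter)
    for induced, gold in zip(induced_labels, gold_labels):
        counts[induced][gold] += 1
    return {induced: c.most_common(1)[0][0] for induced, c in counts.items()}


def disambiguate(induced_labels, mapping, fallback):
    """Label new instances via the mapping; unseen induced senses get the fallback."""
    return [mapping.get(label, fallback) for label in induced_labels]


# Hypothetical toy data for the word "bank": each annotated instance carries
# both an induced cluster ID (from some sense induction model) and a gold sense.
annotated_induced = ["c1", "c1", "c2", "c2", "c2"]
annotated_gold = ["bank%finance", "bank%finance",
                  "bank%river", "bank%river", "bank%finance"]

mapping = learn_sense_mapping(annotated_induced, annotated_gold)

# Assumption: fall back to the most frequent annotated sense for clusters
# that never appeared in the annotated data.
most_frequent = Counter(annotated_gold).most_common(1)[0][0]

# Unlabeled instances only have induced cluster IDs; "c3" was never annotated.
predictions = disambiguate(["c2", "c1", "c3"], mapping, most_frequent)
print(predictions)
```

In this toy setup only five annotated instances are needed to label any instance that the induction model assigns to a known cluster, which is the kind of annotation saving the abstract describes.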

Similar resources

Word Sense Induction and Disambiguation Rivaling Supervised Methods

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words...

Theme: A Study of Classifier Combination and Semi-Supervised Learning for Word Sense Disambiguation

1. Aims. Word Sense Disambiguation (WSD) involves the association of a polysemous word in a text or discourse with a particular sense among numerous potential senses of that word. In this thesis, we present a study of classifier combination and semi-supervised learning for WSD, which aims to boost supervised WSD and improve its accuracy. In addition, we also work on context representation and fe...

Word Sense Disambiguation by Semi-supervised Learning

In this paper, we propose to use a semi-supervised learning algorithm to deal with the word sense disambiguation problem. We evaluated a semi-supervised learning algorithm, the local and global consistency algorithm, on a widely used benchmark corpus for word sense disambiguation. This algorithm yields encouraging experimental results. It achieves better performance than orthodox supervised learning algor...

Word Sense Disambiguation Using Semi-Supervised Naive Bayes with Ontological Constraints

Background. Word sense disambiguation (WSD) is the task of mapping an ambiguous word to its correct sense given its context. As high-quality sense-tagged data is scarce and expensive to obtain, attention has shifted from fully supervised to semi-supervised and knowledge-based approaches to WSD that rely on a lexical knowledge base such as WordNet instead of large amounts of hand-labeled data. Wha...

Soft Word Sense Disambiguation

Word sense disambiguation is a core problem in many tasks related to language processing. In this paper, we introduce the notion of soft word sense disambiguation which states that given a word, the sense disambiguation system should not commit to a particular sense, but rather, to a set of senses which are not necessarily orthogonal or mutually exclusive. The senses of a word are expressed by ...

Journal title:
  • J. Artif. Intell. Res.

Volume 55

Publication date 2016